Multiclass Classifier Building with Amazon Data to Classify Customer Reviews into Product Categories

Authors

  • Yunzhen Hu
  • Te Hu
  • Haier Liu
Abstract

As one of the world’s largest online retailers, Amazon.com has a sales volume of tens of billions of dollars, and as a result it accumulates a huge number of customer reviews. These reviews are valuable training samples that could support commercial prediction and advertising. Our project aims to build a multiclass classifier from Amazon customer reviews that are labeled by product category. We test different feature selections and models when building this classifier. As we will show, the random forest classifier performs best.
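The abstract does not describe the features or the implementation, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes binary bag-of-words features, Gini-based splits, and a made-up six-review toy dataset. All function names (`tokenize`, `build_tree`, `train_forest`, `predict`) and parameter values are hypothetical, chosen for illustration.

```python
import random
from collections import Counter

def tokenize(text):
    """Binary bag-of-words: the set of lowercase tokens in a review."""
    return set(text.lower().split())

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, vocab, rng, n_features, depth=0, max_depth=6):
    """Grow one tree; each split considers a random subset of words."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority  # leaf: a class label
    best = None
    for word in rng.sample(vocab, min(n_features, len(vocab))):
        has = [r for r in rows if word in r[0]]
        lacks = [r for r in rows if word not in r[0]]
        if not has or not lacks:
            continue  # this word does not split the rows
        score = (len(has) * gini([l for _, l in has])
                 + len(lacks) * gini([l for _, l in lacks])) / len(rows)
        if best is None or score < best[0]:
            best = (score, word, has, lacks)
    if best is None:
        return majority
    _, word, has, lacks = best
    return (word,
            build_tree(has, vocab, rng, n_features, depth + 1, max_depth),
            build_tree(lacks, vocab, rng, n_features, depth + 1, max_depth))

def train_forest(reviews, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the labeled reviews."""
    rng = random.Random(seed)
    rows = [(tokenize(t), y) for t, y in zip(reviews, labels)]
    vocab = sorted(set().union(*(tokens for tokens, _ in rows)))
    n_features = max(1, int(len(vocab) ** 0.5))
    return [build_tree(rng.choices(rows, k=len(rows)), vocab, rng, n_features)
            for _ in range(n_trees)]

def predict(forest, text):
    """Predict a product category by majority vote over the trees."""
    tokens = tokenize(text)
    votes = Counter()
    for tree in forest:
        node = tree
        while isinstance(node, tuple):  # internal node: (word, has, lacks)
            word, if_has, if_lacks = node
            node = if_has if word in tokens else if_lacks
        votes[node] += 1
    return votes.most_common(1)[0][0]

# Toy data invented for this sketch; not taken from the paper.
reviews = [
    "great novel with a gripping plot",
    "the author writes a moving story",
    "battery life of this phone is short",
    "the laptop screen and battery are excellent",
    "my kids love this puzzle toy",
    "fun toy for young kids",
]
labels = ["Books", "Books", "Electronics", "Electronics", "Toys", "Toys"]
forest = train_forest(reviews, labels)
print(predict(forest, "the battery drains fast"))
```

A real review corpus would need far more data, a held-out test set, and probably weighted features such as TF-IDF; an established library implementation would normally replace this hand-written forest.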

Similar resources

Assessing Opinion and Usefulness of Product Reviews

Brian Percival, Ali Motamedi. In this project we examined the possibility of using supervised learning techniques to classify the opinion of online product reviews that are not accompanied by a quantitative opinion measure. The classifiers were trained using Amazon.com customer reviews. The test results showed that a combination of a h...

Random Forests for multiclass classification: Random MultiNomial Logit

Several supervised learning algorithms are suited to classify instances into a multiclass value space. MultiNomial Logit (MNL) is recognized as a robust classifier and is commonly applied within the CRM (Customer Relationship Management) domain. Unfortunately, to date, it is unable to handle huge feature spaces typical of CRM applications. Hence, the analyst is forced to immerse himself into fe...

A Naïve Hopfield Neural Network based Approach for Multiclass Classification of Customer Loyalty

Customer classification is an area of utmost interest for all businesses. For any organization, retaining customers is more important than acquiring new ones. In this paper, a simple idea based on the Hopfield Neural Network (HNN) is proposed for multiclass classification of customer loyalty. Initially, a transformation and the k-medoid clustering algorithm preprocess the training dataset. Then...

Grid Base Classifier in Comparison to Nonparametric Methods in Multiclass Classification

In this paper, a new method known as Grid Base Classifier was proposed. This method carries the advantages of the two previous methods in order to improve the classification tasks. The problem with the current lazy algorithms is that they learn quickly, but classify very slowly. On the other hand, the eager algorithms classify quickly, but they learn very slowly. The two algorithms were compare...

Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk

This study explores a semi-supervised classification approach that uses random forest as a base classifier to classify the risk of low-back disorders (LBDs) associated with industrial jobs. The semi-supervised approach uses unlabeled data together with a small amount of labeled data to create a better classifier. The results obtained by the proposed approach are compared with those o...

Publication date: 2014